    Observations on Factors Affecting Performance of MapReduce based Apriori on Hadoop Cluster

    Designing fast and scalable algorithms for mining frequent itemsets has long been a prominent problem in data mining, and Apriori is one of the most widely used frequent itemset mining algorithms. Designing efficient algorithms on the MapReduce framework to process and analyze big datasets is an active area of contemporary research. In this paper, we focus on the performance of MapReduce based Apriori on homogeneous as well as heterogeneous Hadoop clusters. We investigate a number of factors that significantly affect the execution time of MapReduce based Apriori running on homogeneous and heterogeneous Hadoop clusters. These factors cover both algorithmic and non-algorithmic improvements. The algorithmic factors considered are filtered transactions and data structures; experimental results show how an appropriate data structure and the filtered-transactions technique drastically reduce execution time. The non-algorithmic factors include speculative execution, nodes with poor performance, data locality and distribution of data blocks, and parallelism control with input split size. We apply strategies against these factors and fine-tune the relevant parameters for our particular application. Experimental results show that taking care of cluster-specific parameters yields a significant reduction in execution time. We also discuss issues in the MapReduce implementation of Apriori that may significantly influence performance. Comment: 8 pages, 8 figures, International Conference on Computing, Communication and Automation (ICCCA 2016)
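    As background for the algorithmic factors discussed above, the classic level-wise Apriori procedure (candidate generation, subset pruning, support counting) can be sketched in plain Python. This is an illustrative single-machine version, not the paper's MapReduce implementation:

    ```python
    from itertools import combinations

    def apriori(transactions, min_support):
        """Return {itemset: support} for all itemsets with support >= min_support."""
        transactions = [frozenset(t) for t in transactions]
        # Level 1: count individual items.
        counts = {}
        for t in transactions:
            for item in t:
                key = frozenset([item])
                counts[key] = counts.get(key, 0) + 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result = dict(frequent)
        k = 2
        while frequent:
            # Candidate generation with subset pruning: a k-itemset can only
            # be frequent if every one of its (k-1)-subsets is frequent.
            items = sorted({i for s in frequent for i in s})
            candidates = [frozenset(c) for c in combinations(items, k)
                          if all(frozenset(sub) in frequent
                                 for sub in combinations(c, k - 1))]
            # Support counting -- the pass the paper distributes via MapReduce.
            counts = {c: 0 for c in candidates}
            for t in transactions:
                for c in candidates:
                    if c <= t:
                        counts[c] += 1
            frequent = {s: c for s, c in counts.items() if c >= min_support}
            result.update(frequent)
            k += 1
        return result
    ```

    The filtered-transactions optimization the paper evaluates would additionally drop items that appear in no frequent itemset from each transaction between levels, shrinking the data scanned in later passes.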

    HIV/AIDS epidemic in India and predicting the impact of the national response: mathematical modeling and analysis

    After two phases of AIDS control activities in India, the third phase of the National AIDS Control Programme (NACP III) was launched in July 2007. Our focus here is to predict the number of people living with HIV/AIDS (PLHA) in India so that the results can assist the NACP III planning team in setting appropriate targets for the project period (2007-2012). We have constructed a dynamical model that captures the mixing patterns between susceptibles and infectives in both low-risk and high-risk groups in the population. Our aim is to project the HIV estimates by taking into account general interventions for susceptibles and additional interventions, such as targeted interventions among high-risk groups, provision of anti-retroviral therapy, and behavior change among HIV-positive individuals. If the current level of interventions in NACP II continues, the model estimates there will be 5.06 million PLHA by the end of 2011. If 50 percent of the NACP III targets are achieved by the end of the above period, then about 0.8 million new infections will be averted in that year. The current status of the epidemic appears less severe than the trend observed in the late 1990s. The projections based on the second and third phases of the NACP indicate that prevention programmes directed towards the general and high-risk populations, and towards HIV-positive individuals, will determine the decline or stabilization of the epidemic. Model-based results are derived separately for the revised HIV estimates released in 2007. We perform a Monte Carlo procedure for sensitivity analysis of parameters and model validation. We also predict a positive impact of providing anti-retroviral therapy to 90 percent of the eligible people in the country. We present methods for obtaining disease progression parameters using convolution approaches, and we extend our models to age-structured populations.
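    The authors' full model is not reproduced in the abstract; as an illustration of the general compartmental approach, the sketch below Euler-integrates a minimal susceptible-infective (SI) model with a removal rate. The parameters `beta` (transmission) and `mu` (removal) are hypothetical stand-ins, not values from the paper:

    ```python
    def simulate_si(s0, i0, beta, mu, steps, dt=0.01):
        """Euler-integrate a minimal SI model; returns final (S, I).

        beta: hypothetical transmission rate, mu: removal rate.
        """
        s, i = float(s0), float(i0)
        for _ in range(steps):
            n = s + i
            new_infections = beta * s * i / n * dt  # mass-action mixing
            s -= new_infections
            i += new_infections - mu * i * dt
        return s, i
    ```

    Interventions of the kind the paper models enter such a framework by reducing the effective `beta` (behavior change, targeted interventions among high-risk groups) or by altering progression and removal rates (anti-retroviral therapy).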

    An Accurate Facial Component Detection Using Gabor Filter

    Face detection is a critical task in a variety of applications. Since faces exhibit a wide range of expressions, detecting them accurately is difficult. Face detection plays a key role not only in personal identification but also in fields including, but not limited to, image processing, pattern recognition, and computer graphics. The proposed system detects faces and facial components using Gabor filters. The results show accurate detection of facial components.
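    A Gabor filter is a sinusoidal carrier modulated by a Gaussian envelope. The sketch below builds the real part of a Gabor kernel from the standard formula; the parameters `sigma`, `theta`, `lam`, `gamma`, and `psi` are illustrative choices, not values from the paper:

    ```python
    import math

    def gabor_kernel(size, sigma, theta, lam, gamma=0.5, psi=0.0):
        """Real part of a size x size Gabor kernel: a cosine carrier of
        wavelength lam at orientation theta under a Gaussian envelope."""
        half = size // 2
        kernel = []
        for y in range(-half, half + 1):
            row = []
            for x in range(-half, half + 1):
                # Rotate coordinates to the filter orientation.
                xr = x * math.cos(theta) + y * math.sin(theta)
                yr = -x * math.sin(theta) + y * math.cos(theta)
                envelope = math.exp(-(xr * xr + gamma * gamma * yr * yr)
                                    / (2.0 * sigma * sigma))
                carrier = math.cos(2.0 * math.pi * xr / lam + psi)
                row.append(envelope * carrier)
            kernel.append(row)
        return kernel
    ```

    Convolving a face image with a bank of such kernels at several orientations and wavelengths yields responses that peak around oriented structures such as eyes, eyebrows, and mouth corners, which is what makes Gabor features useful for facial component detection.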

    Understanding the Meaning of “Project Success”

    Fortune 500 organizations execute much of their work as projects, and project management is an area of concentration across the world. Different stakeholders have different perspectives on project success. The meaning of project success is explained in this article, and the Project Critical Success Factors (CSFs) are identified. The Standish Group's research on project success and project success metrics is presented, and earlier research on the meaning of project success and project critical success factors is highlighted. The works of Jeffery K. Pinto and Dennis P. Slevin, David and Adam, DeLone and McLean, and the Standish Group are discussed. The methodology is secondary research: a literature review of prominent empirical studies, noting findings and observations from those studies; the initial literature collected led to further searches based on the articles' references. The research findings indicate that the top success factors for many projects include a clear project objective, top management commitment, a competent project team, and user involvement.

    Novel Machine Learning Algorithms Used to Detect Credit Card Fraud Transactions

    During the Covid-19 pandemic, the world was under lockdown and everyone stayed at home. With many restrictions on going out, companies turned to online shopping, which served more people and increased e-commerce revenue; at the same time, online fraud also rose worldwide as everyone adopted online shopping during the pandemic. According to the National Crime Records Bureau, India's credit/debit card fraud count in 2019 was 365. Developed countries had the highest rates of credit card fraud in 2020 compared to India; for that reason, mechanisms that can detect credit card theft must be implemented. Machine learning algorithms, implemented in the R programming language, play an essential role in credit card fraud detection. The following machine learning algorithms are used for credit card fraud detection: Random Forest, Logistic Regression, Decision Trees, and Gradient Boosting classifiers. A European bank dataset of 284,808 transactions is used in our research. Two classes are defined: the positive class (fraud transactions) and the negative class (genuine transactions). The final results demonstrate the performance of our proposed system.
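    A detail worth noting for this kind of dataset is its extreme class imbalance (fraud transactions are a tiny fraction of the 284,808 rows), which makes plain accuracy misleading: a classifier that predicts "genuine" for everything still scores near-perfect accuracy. The sketch below computes precision and recall on the positive (fraud) class, the metrics typically reported for such data; it is an illustrative Python helper, not the paper's R code:

    ```python
    def precision_recall(y_true, y_pred):
        """Precision and recall for the positive class (1 = fraud)."""
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
        precision = tp / (tp + fp) if tp + fp else 0.0  # flagged and correct
        recall = tp / (tp + fn) if tp + fn else 0.0     # fraud actually caught
        return precision, recall
    ```

    An all-genuine predictor gets recall 0.0 on the fraud class despite its high accuracy, which is why precision and recall (or derived scores) are the right lens for comparing the Random Forest, Logistic Regression, Decision Tree, and Gradient Boosting models.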

    Prediction of Success or Failure of Software Projects based on Reusability Metrics using Support Vector Machine

    In computer science and engineering and in the software industry, the term reusability means using existing software assets or previously developed code in the software development process. Software assets are the products and by-products of the product development life cycle, including code, test cases, software designs, and code documentation. The process of modifying existing assets to fit specific requirements is called leveraging, whereas reuse creates a new version of the existing assets; for this reason, reuse is generally preferred over leveraging. Various software metrics are available to assess the quality of reusable software components, but a framework or model that can predict the reusability of software assets still needs to be developed. Reusability metrics must be identified during the design or coding phase, and they can be used to reduce the rework needed to develop a similar software module. This can substantially improve productivity through the resulting increase in reuse. In this study, software metrics representing the reusability of software components are collected for a particular software project to form a database. The database is divided into training and test sets, and a Support Vector Machine is trained using the Radial Basis Function (RBF) kernel to predict whether a software component can be reused or not.
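    As an illustration of the RBF kernel that underlies such an SVM, the sketch below computes K(x, y) = exp(-gamma * ||x - y||^2) and the resulting SVM decision function over a set of support vectors. The toy support vectors, coefficients, and `gamma` value used in the test are hypothetical, not learned from the paper's data:

    ```python
    import math

    def rbf_kernel(x, y, gamma=0.5):
        """RBF kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-gamma * sq_dist)

    def svm_decision(x, support_vectors, dual_coefs, bias, gamma=0.5):
        """Decision value of a trained SVM: sign(f(x)) is the predicted class.

        dual_coefs holds alpha_i * y_i for each support vector, as produced
        by SVM training (training itself is omitted from this sketch).
        """
        return bias + sum(c * rbf_kernel(sv, x, gamma)
                          for sv, c in zip(support_vectors, dual_coefs))
    ```

    Because the RBF kernel decays with distance, each support vector only influences predictions near itself, which lets the classifier carve non-linear boundaries between reusable and non-reusable components in metric space.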

    Finding Mobile Applications in Cellular Device-to-Device Communications: Hash Function and Bloom Filter-Based Approach

    The rapid growth of mobile computing technology and wireless communication has significantly increased the number of mobile users worldwide. We propose a code-based discovery protocol for cellular device-to-device (D2D) communications. To realize proximity-based services such as mobile social networks and mobile marketing using D2D communications, each device should first discover nearby devices that have mobile applications of interest by using a discovery protocol. The proposed discovery protocol makes use of a short discovery code that contains compressed information about the mobile applications on a device. A discovery code is generated using either a hash function or a Bloom filter. When a device receives a discovery code broadcast by another device, it can approximately determine the mobile applications on the other device. The proposed protocol is capable of quickly discovering a massive number of devices while consuming a relatively small amount of radio resources. We analyze the performance of the proposed protocol under the random direction mobility model and a real mobility trace. By simulation, we show that the analytical results match the simulation results well and that the proposed protocol greatly outperforms a simple non-filtering protocol.
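    As an illustration of a Bloom filter based discovery code, the sketch below compresses a device's set of application identifiers into a fixed-size bit array that can be broadcast and probed for approximate membership. The sizes chosen here (`m = 256` bits, `k = 3` hash functions derived from SHA-256) are illustrative assumptions, not the paper's parameters:

    ```python
    import hashlib

    class BloomFilter:
        """m-bit Bloom filter with k hash functions derived from SHA-256."""

        def __init__(self, m=256, k=3):
            self.m, self.k, self.bits = m, k, 0

        def _indexes(self, item):
            # Derive k independent bit positions by salting the hash input.
            for i in range(self.k):
                digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
                yield int(digest, 16) % self.m

        def add(self, item):
            for idx in self._indexes(item):
                self.bits |= 1 << idx

        def __contains__(self, item):
            # May report false positives, never false negatives.
            return all((self.bits >> idx) & 1 for idx in self._indexes(item))

        def encode(self):
            """The discovery code to broadcast: the bit array as m/8 bytes."""
            return self.bits.to_bytes(self.m // 8, "big")
    ```

    A receiving device rebuilds the filter from the broadcast bytes and probes it for each application of interest. Matches are approximate because a Bloom filter can produce false positives; tuning `m` and `k` trades discovery-code size against accuracy, which is exactly the trade-off this class of protocol must balance against radio resource consumption.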